
Performance of Speaker Recognition System in Mismatching Speaking Style

By: Brahmbhatt, Pinky J.
Contributor(s): Maradia, K. G.
Publisher: New Delhi: STM Journals, 2018
Edition: Vol 5 (3), Sep - Dec.
Description: 58-65p.
Subject(s): Computer Engineering
In: Journal of Artificial Intelligence Research and Advances (JoAIRA)
Summary: This paper analyzes a speaker recognition system under different speaking styles and train-test mismatch conditions. In a text-independent speaker recognition system, the speaking style generally remains the same in the training and testing phases. Whispered speech is typically used to convey confidential information or to avoid disturbing others in a quiet place, so it is audible only to nearby listeners. Both the excitation source and the vocal tract system carry speaker-specific information. Because the vocal folds do not vibrate in whispered speech, source-excitation-related information is not present. Fast speech is a tendency to speak rapidly, as if motivated by an urgency that is not obvious to the listener. The experiments use the whisper, solo, and fast speaking styles of the CHAINS (CHAracterizing INdividual Speakers) speech corpus, with the Gaussian Mixture Modeling-Universal Background Modeling (GMM-UBM) approach and the most widely used feature, Mel Frequency Cepstral Coefficients (MFCC). The experiments show that performance degrades sharply when the speaking style used for testing differs from the one used for training.
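The GMM-UBM approach named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes diagonal-covariance mixtures, mean-only MAP adaptation of the background model, and an average log-likelihood-ratio score, which is one common way such systems are built. The function names, the relevance factor value, and the toy 2-D vectors (standing in for MFCC frames) are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def component_log_probs(x, gmm):
    """log(w_k * N(x | mean_k, var_k)) for every mixture component k."""
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(gmm["weights"], gmm["means"], gmm["vars"])
    ]

def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under the GMM."""
    return sum(logsumexp(component_log_probs(x, gmm)) for x in frames) / len(frames)

def map_adapt_means(ubm, frames, relevance=16.0):
    """Mean-only MAP adaptation of the UBM toward a speaker's frames.

    The relevance factor is an assumed tuning value, not taken from the paper.
    """
    k, dim = len(ubm["weights"]), len(ubm["means"][0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x in frames:
        logp = component_log_probs(x, ubm)
        norm = logsumexp(logp)
        for j in range(k):
            post = math.exp(logp[j] - norm)  # component posterior for this frame
            counts[j] += post
            for d in range(dim):
                sums[j][d] += post * x[d]
    # Equivalent to alpha * E[x] + (1 - alpha) * ubm_mean,
    # with alpha = count / (count + relevance).
    means = [
        [(sums[j][d] + relevance * ubm["means"][j][d]) / (counts[j] + relevance)
         for d in range(dim)]
        for j in range(k)
    ]
    return {"weights": ubm["weights"], "means": means, "vars": ubm["vars"]}

def llr_score(frames, speaker, ubm):
    """Average log-likelihood ratio: speaker model versus background model."""
    return avg_log_likelihood(frames, speaker) - avg_log_likelihood(frames, ubm)

# Toy 2-D data standing in for MFCC frames (hypothetical values).
ubm = {
    "weights": [0.5, 0.5],
    "means": [[0.0, 0.0], [4.0, 4.0]],
    "vars": [[1.0, 1.0], [1.0, 1.0]],
}
train_frames = [[0.9, 1.1], [1.1, 0.9], [1.0, 1.2], [1.2, 1.0], [0.8, 0.8]]
speaker = map_adapt_means(ubm, train_frames, relevance=2.0)

matched_score = llr_score([[1.0, 1.0], [0.9, 1.1]], speaker, ubm)
other_score = llr_score([[-1.0, -1.0], [-1.2, -0.8]], speaker, ubm)
print("frames close to the enrollment data:", matched_score)
print("frames far from the enrollment data:", other_score)
```

In this toy setup, test frames that resemble the enrollment data score higher than frames that do not. In a real system, the frames would be MFCC vectors extracted from the enrollment and test utterances, and the UBM would be trained on speech from many speakers.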
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2021-2021402

